Abstract

Flash memory is widely used as the secondary storage in lightweight computing devices due 
to its outstanding advantages over magnetic disks. Flash memory has many access characteristics 
different from those of magnetic disks, and how to take advantage of them is becoming an important 
research issue. There are two existing approaches to storing data into flash memory: page-based and 
log-based. The former has good performance for read operations, but poor performance for write 
operations. In contrast, the latter has good performance for write operations when updates are light, 
but poor performance for read operations. In this paper, we propose a new method of storing data, 
called page-differential logging, for flash-based storage systems that overcomes the drawbacks of the two
methods. The primary characteristics of our method are: (1) writing only the difference (which we
define as the page-differential) between the original page in flash memory and the up-to-date page in
memory; (2) computing and writing the page-differential only once at the time the page needs to be
reflected into flash memory. The former contrasts with existing page-based methods, which write the
whole page including both changed and unchanged parts of data, and with log-based ones, which keep
track of the history of all the changes in a page. Our method allows existing disk-based DBMSs
to be reused as flash-based DBMSs just by modifying the flash memory driver, i.e., it is DBMS- 
independent. Experimental results show that the proposed method is superior in I/O performance, 
except for some special cases, to existing ones. Specifically, it improves the performance of various
mixes of read-only and update operations by 0.5 (the special case when all transactions are read-
only on updated pages) ~ 3.4 times over the page-based method and by 1.6 ~ 3.1 times over the
log-based one for synthetic data of approximately 1 Gbyte. The TPC-C benchmark also shows
improvement of the I/O time over existing methods by 1.2 ~ 6.1 times. This result indicates the 
effectiveness of our method under (semi) real workloads. 
1 Introduction



Flash memory is a non-volatile secondary storage that is electrically erasable and reprogrammable [4] [10].
Flash memory has outstanding advantages over magnetic disks: lighter weight, smaller size, better shock 
resistance, lower power consumption, and faster access time |10 ( fM j I25 j . Due to these advantages, the 
flash memory is widely used in embedded systems and mobile devices such as mobile phones, MP3
players, and digital cameras [14] [15].



Flash memory differs greatly from a magnetic disk in structure and access characteristics [12].
It is composed of a number of blocks, and each block is composed of a fixed number of pages. It
does not have seek and rotation latency because it is made of electronic circuits without mechanically
moving parts [12]. Flash memory provides three kinds of operations: read, write, and erase. In order
to overwrite existing data in a page, an erase operation must be performed before writing new data 
on the page [12] [18]. The write and erase operations are much slower than the read operation [14] [18].
Besides, the unit of the erase operation is a block, while the unit of the read and write operations is a 
page [25] . 

There have been a number of studies [21 |3l El [131 El EI] on the method of storing updated pages 
into flash memory for flash-based storage systems. In this paper, we refer to such methods as page 
update methods. The page update methods are classified into two categories [25]: page-based [3] [13]
and log-based [2] [14] [21]. Page-based methods write the whole page into flash memory when an updated
page needs to be reflected into flash memory (e.g., when the page is swapped out from the DBMS buffer
to the database) [3] [13] [25]. These methods read only one page when recreating a page from
flash memory (e.g., reading it into a DBMS buffer). Thus, they have good read performance. However,
they have relatively poor write performance because they write the whole page including unchanged
parts as well as changed parts of data [25]. In order to overcome this drawback, log-based methods have
been proposed [25]. These methods write only the changes (which we call an update log) in the page
into the write buffer, which in turn is written into flash memory when the buffer is full [2] [14] [21]. Thus,
compared with page-based methods, log-based ones have good write performance when updates are not
heavy [25]. Log-based methods, however, have relatively poor read performance because they keep the
history of all the changes (i.e., multiple update logs) in a page. Whenever an update is done, they write
an update log into the write buffer. When updates are done multiple times, the update logs are therefore
likely to be written into multiple pages in flash memory. Consequently, log-based methods need to read multiple
pages when recreating a page from flash memory.

^ An update log contains the changes in a page resulting from a single update command.

In this paper, we propose a page update method called page-differential logging (PDL) for flash- 
based storage systems. A page-differential (simply, a differential) is defined as the difference between 
the original page in the flash memory and the up-to-date page in memory. This novel method is much 
different from page-based methods or log-based ones in the following ways. (1) We write only the 
differential of an updated page. This characteristic stands in contrast with page-based methods that 
write the whole page including changed and unchanged parts of data or log-based ones that keep track 
of the history of all the changes (i.e., multiple update logs) in a page. Furthermore, we compute and 
write the differential only once at the time the updated page needs to be reflected into flash memory. 
The overhead of generating the differential is relatively minor because, in flash memory, the speed of 
read operation is much faster than those of write or erase operations. (2) When recreating a page 
from flash memory, we need fewer read operations than log-based ones do because we read at most two 
pages: the original page and the single page containing the differential. (3) When we need to reflect an 
updated page into flash memory, we need fewer write operations than others do because we write only 
the differential. A side benefit is that the longevity of flash memory is also improved due to fewer erase
operations resulting from fewer write operations. (4) Our method is loosely-coupled with the storage
system while the log-based ones are tightly-coupled. The log-based methods need to modify the storage 
management module of the DBMS because they must identify the changes in a page whenever it is 
updated. These changes can be identified only inside the storage management module because they 
are internally maintained by the system. On the other hand, our method does not need to modify the 
module of the DBMS because it computes the differential outside the storage management module by 



^ When pages are frequently updated, the log-based methods could be poorer in performance as we see in the experi- 
ments in p. 30, Figure [THl 
comparing the page that needs to be reflected with the original page in the flash memory. We elaborate
on this point later in Section 21 

The contributions of this paper are as follows. (1) We propose a new notion of "differential" of a
page. Using this notion, we then propose a new approach to updating pages that we call page-differential
logging. (2) Our method is DBMS-independent. (3) Through extensive experiments, we show that the
overall read and write performance of our method is mostly superior to those of existing ones. 

Hereafter, in order to reduce ambiguity in this paper, we distinguish logical pages from physical 
pages. We call the pages in memory logical pages and the ones in flash memory physical pages. For ease 
of exposition, we assume that the size of a logical page is equal to that of a physical page. 

The rest of this paper is organized as follows. Section 2 introduces flash memory. Section 3 describes
prior work related to the page update methods for flash-based storage systems. Section 4 presents a
new page update method called page-differential logging. Section 5 presents the results of performance
evaluation. Section 6 summarizes and concludes the paper.

2 Flash Memory 

Based on the structure of memory cells, there are two major types of flash memory [6]: the NAND type
and the NOR type. The former is suitable for storing data, and the latter for storing code [16]. In the
rest of this paper, we use the term 'flash memory' to indicate the NAND type flash memory, which is
widely used in flash-based storage systems.

Figure 1 shows the structure of flash memory. The flash memory consists of Nblock blocks, and each
block consists of Npage pages. A page is the smallest unit of reading and writing data, and a block is the
smallest unit of erasing data [5S] . Each page consists of a data area used for storing data and a spare 
area used for storing auxiliary information such as the valid bit, obsolete bit, bad block identification, 
and error correction check (ECC) [16] . 

^ In this paper, we focus on flash memory but not on solid state disks (SSDs) [19], which have controllers with their
own page update methods.
Figure 1. The structure of flash memory.



We consider three operations: read, write, and erase [6]. 



• The read operation: returns all the bits in the addressed page

• The write operation: changes a set of bits selected in the target page from 1 to 0

• The erase operation: sets all the bits in the addressed block to 1

The operations in flash memory are different from those in the magnetic disk in two ways. First, all the
bits in flash memory are initially set to 1. Thus, writing to flash memory means selectively changing 
some bits in a page from 1 to 0. Next, the erase operation in flash memory changes the bits in a block 
back to 1. Each block can sustain only a limited number of erase operations before becoming unreliable, 
which is restricted to about 100,000 [18].

Due to the restriction of the write and erase operations, a write operation is usually preceded by 
an erase operation in order to overwrite a page [TH [Tl] . We first change all the bits in the block to 1 
using an erase operation, and then, change some bits in the page to 0 using a write operation. We note
that the erase operation is performed in a much larger unit than a write operation, i.e., the former is
performed on a block while the latter on a page. The specific techniques for overwriting a page depend
on the page update method employed. These techniques are discussed in Section 3.
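
The bit-level behavior of the three operations can be illustrated with a small model. This is only a sketch under simplifying assumptions: the class name FlashBlock, the tiny page and block sizes, and the use of Python integers as bit vectors are all illustrative and are not taken from the paper.

```python
class FlashBlock:
    """Toy model of one NAND block; each page is an int used as a bit vector."""

    def __init__(self, pages_per_block=4, bits_per_page=8):
        self.all_ones = (1 << bits_per_page) - 1
        # Every bit starts out as 1 (the erased state).
        self.pages = [self.all_ones] * pages_per_block

    def read(self, page_no):
        # Returns all the bits in the addressed page.
        return self.pages[page_no]

    def write(self, page_no, data):
        # A write can only change bits from 1 to 0, so the stored result is
        # the bitwise AND of the current content and the new data.
        self.pages[page_no] &= data & self.all_ones

    def erase(self):
        # Erase works on the whole block and sets every bit back to 1.
        self.pages = [self.all_ones] * len(self.pages)


block = FlashBlock()
block.write(0, 0b10101010)
first = block.read(0)        # the page now holds 0b10101010
block.write(0, 0b11110000)   # overwrite attempt without a preceding erase
second = block.read(0)       # 0b10100000: bits already at 0 stay at 0
block.erase()
third = block.read(0)        # 0b11111111 again
```

The second write shows why overwriting needs an erase first: without it, the page ends up holding neither the old nor the new data.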

Based on the capacity of memory cells, there are two types of flash memory [12]: Single Level
Cell (SLC)-type and Multi Level Cell (MLC)-type. The former is capable of storing one data bit per 



* Due to this characteristic, there have been a number of studies on wear-leveling [10] and bad block management [16].
We do not address them in this paper, but they can be applied to the storage system independently of
the page update methods discussed in this paper.



cell, while the latter is capable of storing two (or even more) data bits per cell. Thus, MLC-type flash 
memory has greater capacity than SLC-type one and is expected to be widely used in high-capacity 
flash storages [12]. Table [1] summarizes the parameters and values of MLC flash memory we use in our 
experiments. We note that the size of a page is 2,048 bytes, and a block has 64 pages. In addition, the 
access time of operations increases in the following order: read, write, and erase. The read operation is 
9.2 times faster than the write operation, which is 1.5 times faster than the erase operation. 
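
The two ratios quoted above can be checked directly against the access times listed in Table 1. The constant names in this short calculation are illustrative only.

```python
# Access times from Table 1, in microseconds.
T_READ_US = 110
T_WRITE_US = 1010
T_ERASE_US = 1500

write_vs_read = T_WRITE_US / T_READ_US     # how much slower a write is than a read
erase_vs_write = T_ERASE_US / T_WRITE_US   # how much slower an erase is than a write

print(round(write_vs_read, 1), round(erase_vs_write, 1))
```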



Table 1. The parameters and values of flash memory*.

Symbols   Definitions                                       Values
Nblock    the number of blocks                              32,768
Npage     the number of pages in a block                    64
Sblock    the size of a block (bytes) (= Npage × Spage)     135,168 (= 64 × 2,112)
Spage     the size of a page (bytes) (= Sdata + Sspare)     2,112 (= 2,048 + 64)
Sdata     the size of data area in a page (bytes)           2,048
Sspare    the size of spare area in a page (bytes)          64
Tread     the read time for a page (μs)                     110
Twrite    the write time for a page (μs)                    1010
Terase    the erase time for a block (μs)                   1500

* Samsung K9L8G08U0M 2 Gbytes MLC NAND flash memory [18]



3 Related Work 

The Page-Based Approach 

In page-based methods [3] [13], a logical page is stored into a physical page. When an updated
logical page needs to be reflected into flash memory, the whole logical page is written into a physical
page [25]. When a logical page is recreated from flash memory, it is read directly from a physical page.
These methods are loosely-coupled with the storage system because they can be implemented in a
middle layer, called the Flash Translation Layer (FTL) [3], which maintains the address
mapping between logical and physical pages as shown in Figure 2. The FTL can be implemented as
hardware in the controller residing in SSDs, or can be implemented as software in the operating system
for embedded boards.

^ Commercial FTLs for SSDs or embedded boards typically use page-based methods [1].

Figure 2. The architecture of the page-based method.

In page-based methods, there are two update schemes [15], in-place update and out-place update,
depending on whether or not the logical page is always written into the same physical page.
When a logical page needs to be reflected into flash memory, the in-place update overwrites it into
the specific physical page that was read [15], but the out-place update writes it into a new physical
page.

In-Place Update: As explained in Section 2, the write operation in flash memory cannot change bits
in a page to 1. Therefore, when overwriting the logical page l1 that was read from the physical page p1
in the block b1 into the same physical page p1, we do the following four steps: (1) read all the pages
in b1 except p1; (2) erase b1; (3) write l1 into p1; (4) write all the pages read in Step (1) into
the corresponding pages in b1. The in-place update scheme suffers from severe performance problems
and is rarely used in flash memory [15] because it causes an erase operation and multiple read and write
operations whenever we need to reflect a logical page into flash memory.
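
To give a sense of scale, the four steps can be costed with the parameters of Table 1. The following is a back-of-the-envelope sketch, not a measurement from the paper: the function names are illustrative, and the single-write figure used for comparison ignores the cost of marking the old page obsolete and any amortized garbage collection.

```python
# Parameters from Table 1 (times in microseconds).
N_PAGE = 64
T_READ_US, T_WRITE_US, T_ERASE_US = 110, 1010, 1500


def in_place_update_cost_us(n_page=N_PAGE):
    """Cost of the four in-place steps for one logical page."""
    reads = (n_page - 1) * T_READ_US   # Step 1: read every other page in the block
    erase = T_ERASE_US                 # Step 2: erase the block
    writes = n_page * T_WRITE_US       # Steps 3 and 4: write all pages back
    return reads + erase + writes


def single_page_write_cost_us():
    """Lower bound for writing one page elsewhere (assumption: bookkeeping ignored)."""
    return T_WRITE_US


in_place = in_place_update_cost_us()
single = single_page_write_cost_us()
print(in_place, single, round(in_place / single, 1))
```

Under these assumptions one in-place update costs 73,070 μs, roughly 72 times the time of a single page write.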

Out-Place Update: Figure 3 shows a typical example of the out-place update scheme. Figure 3(a)
shows the logical page l1 read from the physical page p1 in the block b1. Figure 3(b) shows the updated
logical page l1 and the two physical pages p1 and p2: the original page read and the new page written.
In order to overcome the drawback of in-place update, when we need to reflect the logical page l1 into
flash memory, the out-place update scheme first writes l1 into a new physical page p2, and then, sets
p1 to obsolete. When there is no more free page in flash memory, a block is selected and obsolete
pages in it are reclaimed by garbage collection, which converts obsolete pages to free pages. The
out-place update scheme is widely used in flash-based storage systems [25] because it does not cause an
erase operation when a logical page is to be reflected into flash memory.
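
The out-place scheme can be sketched as a mapping table plus per-page state flags. This is a minimal illustration under stated assumptions: the class OutPlaceStore and its fields are hypothetical, garbage collection is omitted, and page contents are plain Python bytes rather than flash pages.

```python
FREE, VALID, OBSOLETE = "free", "valid", "obsolete"


class OutPlaceStore:
    """Toy out-place update: every write goes to a fresh physical page."""

    def __init__(self, n_physical_pages):
        self.data = [None] * n_physical_pages
        self.state = [FREE] * n_physical_pages
        self.mapping = {}  # logical page ID -> physical page number

    def write(self, logical_id, content):
        # Pick the first free page; a real system would trigger garbage
        # collection here instead of failing when none is left.
        new_page = self.state.index(FREE)
        self.data[new_page] = content
        self.state[new_page] = VALID
        old_page = self.mapping.get(logical_id)
        if old_page is not None:
            self.state[old_page] = OBSOLETE  # the old copy is only marked, not erased
        self.mapping[logical_id] = new_page

    def read(self, logical_id):
        return self.data[self.mapping[logical_id]]


store = OutPlaceStore(4)
store.write("l1", b"version 1")   # lands in physical page 0
store.write("l1", b"version 2")   # lands in physical page 1; page 0 becomes obsolete
```

No erase is issued on the write path; the obsolete page simply waits to be reclaimed.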
(a) The logical page l1 read from the physical page p1.

(b) The updated logical page l1 and the process of writing it into the physical page p2.

Figure 3. An example of out-place update.



The Log-Based Approach 

In log-based methods [3J [T3J [3T] , a logical page is generally stored into multiple physical pages [T3] . 
Whenever logical pages are updated, the update logs of multiple logical pages are first collected into a 
write buffer in memory [25]. When this buffer is full, it is written into a single physical page. Thus,
when a logical page is updated many times, its update logs can be stored into multiple physical pages. 
Accordingly, when recreating a single logical page, multiple physical pages may need to be read and
merged. The log-based methods are tightly-coupled with the storage system because the storage system
must be modified to be able to identify the update logs of a logical page.

^ We set a page to obsolete by changing the obsolete bit in the spare area of the page from 1 to 0, as in Gal et al. [6].

Log-based methods include the Log-structured File System (LFS) [17], the Journaling Flash File
System (JFFS) [21], Yet Another Flash File System (YAFFS) [2], and In-Page Logging (IPL) [14]. In LFS,
JFFS, and YAFFS, the update logs of a logical page can be written into arbitrary log pages in flash
memory while, in IPL, the update logs must be written into specific log pages. IPL divides the pages in
each block into a fixed number of original pages and log pages. It writes the update logs of a logical page
into only the log pages in the block containing the original (physical) page of the logical page. Therefore,
when recreating the logical page, IPL reads the original page and only the log pages in the same block.
When there is no free log page in the block, IPL merges the original pages with the log pages in the block,
and then, writes the merged pages into pages in a new block (this process is called merging [14]). The
old block is subsequently erased and garbage-collected. Consequently, IPL improves read performance
by reducing the number of log pages to read from flash memory when recreating a logical page because
the number of log pages does not increase indefinitely (i.e., it is bounded) due to merging. Otherwise, the
performance of IPL is similar to that of other log-based methods since IPL inherits their advantages and
drawbacks apart from the effect of merging and the bounded read performance.

Figure 4 shows a typical example of the log-based methods. Figure 4(a) shows the logical pages l1
and l2 in memory. Figure 4(b) shows the update logs q1 and q2 of logical pages l1 and l2, respectively,
and the process of writing them into flash memory. Here, the update logs q1 and q2 are first written
into the write buffer, and then, the content of the write buffer is written into the log page p3. Thus,
the update logs q1 and q2 are collected into the same log page p3. Figure 4(c) shows a similar situation
for the update logs q3 and q4 of logical pages l1 and l2. Figure 4(d) shows the logical page l1 being
recreated from flash memory. Here, l1 is recreated by merging the original page p1 with the update logs
q1 and q3 read from the log pages p3 and p4, respectively.
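
The read path of this example can be sketched as follows. The layout of an update log as an (owner, offset, data) tuple, the dictionary standing in for flash memory, and the page contents are all assumptions made for illustration; the paper does not prescribe this representation.

```python
# Hypothetical flash contents mirroring the example: p1 and p2 are original
# pages, p3 and p4 are log pages that each hold logs of both logical pages.
flash = {
    "p1": b"aaaaaaaa",                              # original page of l1
    "p2": b"xxxxxxxx",                              # original page of l2
    "p3": [("l1", 0, b"bb"), ("l2", 4, b"yy")],     # q1, q2
    "p4": [("l1", 1, b"cc"), ("l2", 0, b"z")],      # q3, q4
}


def recreate(logical_id, original_page, log_pages):
    """Merge the original page with its update logs, counting page reads."""
    page = bytearray(flash[original_page])
    reads = 1
    for log_page in log_pages:          # log pages must be applied in write order
        reads += 1
        for owner, offset, data in flash[log_page]:
            if owner == logical_id:
                page[offset:offset + len(data)] = data
    return bytes(page), reads


content, reads = recreate("l1", "p1", ["p3", "p4"])
print(content, reads)
```

Recreating l1 here takes three page reads, one for p1 and one for each log page, and the count grows with the number of log pages that hold logs of l1.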
(b) The update logs q1 and q2 of logical pages l1 and l2, and the process of writing them into the log page p3 in flash memory.

(c) The update logs q3 and q4 of logical pages l1 and l2, and the process of writing them into the log page p4 in flash memory.

(d) The logical page l1 being recreated from flash memory.

Figure 4. An example of the log-based approach.

4 The Page-Differential Logging Approach 

In this section, we propose page-differential logging (PDL) for flash-based storage systems. Section 4.1
explains the design principles, and then, presents PDL, which conforms to these principles. Sections 4.2
and 4.3 present the data structures and algorithms. Section 4.4 discusses the strengths and limitations.

4.1 Design Principles 

We identify three design principles for PDL in order to guarantee good performance for both read and 
write operations. These principles overcome the drawbacks of both the page-based methods and the 
log-based methods in the following ways. 

• writing difference only : We write only the difference when a logical page needs to be reflected
into flash memory.

• at-most-one-page writing : We write at most one physical page when a logical page needs to 
be reflected into flash memory even if the page has been updated in memory multiple times. 

• at-most-two-page reading : We read at most two physical pages when recreating a logical page 
from flash memory. 
The page-differential logging method conforms to these three design principles. In this method, a logical
page is stored into two physical pages — a base page and a differential page. Here, the base page contains 
a whole logical page, which could be the old version, and the differential page contains the difference 
between the base page and the up-to-date logical page. A differential page can contain differentials of 
multiple logical pages. Thus, the differentials of two logical pages could be stored in the same differential 
page. 

The differential has the following advantages over the list of update logs in the log-based methods. 
(1) It can be computed without maintaining all the update logs, i.e., it can be computed by comparing 
the updated logical page with its base page only when the updated logical page needs to be reflected 
into flash memory. (2) It contains only the difference from the original page for the part that has been 
updated multiple times in a logical page. When a specific part in a logical page is updated in memory 
multiple times, the list of update logs contains all the history of changes while the differential contains 
only the difference between the original data and the up-to-date data. For instance, let us assume that a
logical page is updated in memory twice as follows: ...aaaaaa... → ...bbbbba... → ...bcccba.... Here, the
list of update logs contains two changes, bbbbb and ccc, while the differential contains only the difference
bcccb.
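
The example can be reproduced with a short comparison routine. This is a sketch only: the paper does not specify how the comparison is carried out, so the byte-by-byte scan and the function name compute_differential are assumptions made for illustration.

```python
def compute_differential(base: bytes, updated: bytes):
    """Return (offset, changed_bytes) runs where `updated` differs from `base`."""
    assert len(base) == len(updated)
    runs, start = [], None
    for i in range(len(base)):
        if base[i] != updated[i]:
            if start is None:
                start = i                      # a new run of changed bytes begins
        elif start is not None:
            runs.append((start, updated[start:i]))
            start = None
    if start is not None:                      # a run that reaches the end of the page
        runs.append((start, updated[start:]))
    return runs


original = b"aaaaaa"
update_logs = [(0, b"bbbbb"), (1, b"ccc")]     # one log per update command

# Apply the two updates in memory, as the buffer manager would.
page = bytearray(original)
for offset, data in update_logs:
    page[offset:offset + len(data)] = data

differential = compute_differential(original, bytes(page))
logged_bytes = sum(len(data) for _, data in update_logs)
differential_bytes = sum(len(data) for _, data in differential)
```

The update logs carry 8 bytes of changed data while the differential carries 5, because the bytes overwritten twice are recorded only once.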

In PDL, when an updated logical page needs to be reflected into flash memory, we create a differ- 
ential by comparing the logical page with the base page in flash memory, and then, write the differential 
into the one-page write buffer, which is subsequently written into flash memory when it is full. Therefore, 
it conforms to the writing-difference-only principle. 

We note that, when a logical page is simply updated, we just update the logical page in memory 
without recording the log. Instead, we defer creating and writing the differential until the updated 
logical page needs to be reflected into flash memory. Thus, our method satisfies the at-most-one-page 
writing principle. 

Theoretically, the size of the differential cannot be larger than that of one page. However, prac- 
tically, it could be larger if a large part of the page has been updated. This case can occur since the 
differential contains not only the changed data but also the meta data such as offsets and lengths. In 
this case, we discard the created differential and write the updated logical page itself into flash memory
as a new base page in order to satisfy the at-most-one-page writing principle. (In this special case, 
PDL becomes the same as the page-based method.) 

When recreating a logical page from flash memory, we read the base page and its corresponding 
differential page, and then, merge the base page with its differential in the differential page. However, we 
need to read only one physical page if the base page has not been updated (i.e., there is no differential 
page). Thus, we need to read at most two physical pages, and accordingly, PDL conforms to the 
at-most-two-page reading principle. 

When there is no more free page in flash memory, obsolete pages are reclaimed by garbage collec- 
tion. Here, we select one block for garbage collection. Since it may contain valid base or differential 
pages, before erasing the block, we move those valid pages into a new block, which is reserved for the 
garbage collection process [6]. For differential pages, however, we move only valid differentials into a
new differential page, i.e., we do compaction here. Our method requires fewer write operations than 
page-based or log-based ones do because it satisfies the writing-difference-only and at-most-one-page 
writing principles. Thus, our method invokes garbage collection less frequently than other methods do. 

Figure 5 shows an example of PDL. Here, we have base_page(p), differential_page(p), and
differential(p) for the logical page p. Figure 5(a) shows the logical pages l1 and l2 in memory. Figure 5(b)
shows the updated logical pages l1 and l2, and the process of writing them into flash memory. When l1
and l2 need to be reflected into flash memory, we perform the following three steps: (1) read the base
pages p1 and p2 from flash memory; (2) create differential(l1) and differential(l2) by comparing l1 and l2
with the base pages p1 and p2, respectively; (3) write differential(l1) and differential(l2) into the write
buffer, which is subsequently written into the physical page p3 when the buffer is full. We note that the
differentials of the two different logical pages l1 and l2 are written into the same differential page p3.
Figure 5(c) shows the logical page l1 recreated from flash memory by merging the base page p1 with
differential(l1) in p3.

^ Conceptually, we require an assembly buffer in order to merge the base page with the differential. But, in practice, 
we can use the logical page itself as the assembly buffer. 
(c) The logical page l1 recreated from flash memory.

Figure 5. An example of the differential-based approach. 

4.2 Data Structures 



The data structures used in flash memory are base pages, differential pages, and differentials. A base 
page stores a logical page in its data area and stores the page's type, physical page ID, and creation 
time stamp in its spare area. Here, the type indicates whether the page is a base one or differential one,
and the physical page ID represents the unique identifier of a page in the database. The creation time 
stamp indicates when the base page was created. 

A differential page stores differentials of logical pages in its data area and stores the page's type in 
its spare area. A physical page ID and a creation time stamp are stored also in a differential to identify 
the base page to which the differential belongs and when the differential was created. Therefore, the 
structure of a differential is in the form of <physical page ID, creation time stamp, [offset, length,
changed data]+>.

The three data structures used in memory are the physical page mapping table, the valid differential 
count table, and the differential write buffer. The physical page mapping table maps a physical page ID 
into < base page address, differential page address >. This table is used to indirectly reference a base 
and differential page pair in flash memory because, in flash memory, the positions of the physical pages 
can be changed by the out-place scheme. 

The valid differential count table counts the number of valid differentials (i.e., those that have not 
been obsoleted) in a differential page. When the count becomes 0, the differential page is set to obsolete 
and made available for garbage collection. 

The differential write buffer is used to collect differentials of logical pages into memory and later 
write them into a differential page in flash memory when it is full. The differential write buffer consists 
of a single page, and thus, the memory usage is negligible. Figure [6] shows the data structures for PDL. 
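
The structures described in this section, together with the at-most-two-page read path of Section 4.1, might be sketched as follows. All class, field, and function names are hypothetical, flash memory is modeled as a plain dictionary, and the choice to consult the write buffer before flash is an assumption of this sketch rather than something stated in the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Differential:
    pid: int                           # physical page ID of the owning base page
    timestamp: int                     # creation time stamp
    changes: List[Tuple[int, bytes]]   # (offset, changed data); length is len(data)


@dataclass
class MappingEntry:
    base_page: int                             # address of the base page
    differential_page: Optional[int] = None    # address of the differential page, if any


@dataclass
class PdlTables:
    ppmt: Dict[int, MappingEntry] = field(default_factory=dict)  # physical page mapping table
    vdct: Dict[int, int] = field(default_factory=dict)           # valid differential count table
    dwb: List[Differential] = field(default_factory=list)        # differential write buffer


def recreate_page(pid, tables, flash):
    """Merge a base page with its differential; returns (page, flash page reads)."""
    entry = tables.ppmt[pid]
    page = bytearray(flash[entry.base_page])
    reads = 1
    # Assumption: a differential still waiting in the write buffer is the newest.
    candidates = [d for d in tables.dwb if d.pid == pid]
    if not candidates and entry.differential_page is not None:
        reads += 1
        candidates = [d for d in flash[entry.differential_page] if d.pid == pid]
    if candidates:
        newest = max(candidates, key=lambda d: d.timestamp)
        for offset, data in newest.changes:
            page[offset:offset + len(data)] = data
    return bytes(page), reads


tables = PdlTables()
flash = {0: b"aaaaaa", 1: b"xxxxxx",
         7: [Differential(pid=10, timestamp=1, changes=[(0, b"bcccb")])]}
tables.ppmt[10] = MappingEntry(base_page=0, differential_page=7)
tables.ppmt[11] = MappingEntry(base_page=1)
tables.vdct[7] = 1
updated = recreate_page(10, tables, flash)     # base page plus differential page
untouched = recreate_page(11, tables, flash)   # base page only
```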

4.3 Algorithms 

In this section, we present the algorithms for writing a logical page into flash memory and for recreating 
a logical page from flash memory. We call them PDL_Writing and PDL_Reading, respectively.

Figure 7 shows the algorithm PDL_Writing. The inputs to the algorithm are the logical page p
and its physical page ID pid. The algorithm consists of the following three steps. In Step 1, we read
base_page(pid) from flash memory. In Step 2, we create differential(pid) by comparing base_page(pid)
with p given as an input. In Step 3, we write differential(pid) into the differential write buffer. If an old
differential(pid) resides in the buffer, we first remove the old one, and then, write the new one. Here,
there are three cases according to the size of differential(pid). First, when the size of differential(pid)
is equal to or smaller than the free space of the buffer (Case 1), we just write differential(pid) into the
buffer. Second, when it is larger than the free space of the buffer but is equal to or smaller than
Max_Differential_Size (Case 2), we execute the procedure writingDifferentialWriteBuffer( ) in Figure 8,
clear the buffer, and then, write differential(pid) into the buffer. Here, Max_Differential_Size is defined
as the maximum size of differentials to be stored in differential pages. The procedure
writingDifferentialWriteBuffer( ) consists of the following two steps. In Step 1, we write the buffer's contents into
the differential page q that is newly allocated in flash memory. In Step 2, we update the physical page
mapping table ppmt and the valid differential count table vdct. For each differential d in the buffer, we
decrement the count for the old differential page dp in vdct by executing the procedure
decreaseValidDifferentialCount( ). Here, if the count becomes 0, we set the differential page to obsolete and make
it available for garbage collection. We then set differential_page(pid_d) in ppmt to the new differential
page q and increment the count for q in vdct. Here, pid_d is the physical page ID of the base page to
which the differential d belongs. Third, when the size of differential(pid) is larger than Max_Differential_Size
(Case 3), we discard differential(pid) and execute the procedure writingNewBasePage( ) in Figure 8. The
procedure consists of the following two steps. In Step 1, we write the logical page p itself into the base page q that is
newly allocated in flash memory. In Step 2, we update ppmt and vdct. We set the old base page bp to
obsolete, making it available for garbage collection. We then decrement the count for the old differential
page dp in vdct by executing the procedure decreaseValidDifferentialCount( ) and set base_page(pid)
and differential_page(pid) in ppmt to q and null, respectively. Figure 8 shows the procedures for the
PDL_Writing algorithm.

Figure 6. The data structures for PDL.

* In Section 4.1, for ease of exposition, we have explained PDL on the assumption that Max_Differential_Size = the
size of one physical page. However, in practice, we can adjust it according to the workload. We will show the performance
while varying Max_Differential_Size later in the experiment section (Section 5).



Algorithm PDL_Writing:
Inputs: (1) p    /* updated logical page */
        (2) pid  /* physical page ID of p */
Algorithm:
    /* Step 1. Reading the base page by looking up the physical page mapping table ppmt */
    bp := ppmt(pid).base_page;
    Read bp from flash memory;

    /* Step 2. Creating a differential */
    Create differential(pid) by comparing bp read from flash memory with
        the updated logical page p given as an input;

    /* Step 3. Writing the differential into the differential write buffer dwb */
    IF old differential(pid) resides in dwb THEN
        Remove old differential(pid);
    END /* IF */

    IF the size of differential(pid) <= free space of dwb THEN            /* Case 1 */
        Write differential(pid) into dwb;
    ELSE IF the size of differential(pid) > free space of dwb AND
            the size of differential(pid) <= Max_Differential_Size THEN   /* Case 2 */
        Call writingDifferentialWriteBuffer( );
        Clear dwb;
        Write differential(pid) into dwb;
    ELSE IF the size of differential(pid) > Max_Differential_Size THEN    /* Case 3 */
        Discard differential(pid);
        Call writingNewBasePage( );
    END /* IF */

Figure 7. Writing a logical page into flash memory in PDL. 



^3 For the spare area in a page, a write operation that changes a set of bits from 1 to 0 can be repeatedly
performed up to four times without an erase operation [B].

Procedure writingDifferentialWriteBuffer( ):
Input: dwb   /* differential write buffer */
Algorithm:
    /* Step 1. Writing dwb into flash memory as a differential page */
    Write its contents into the physical page q that is newly allocated in flash memory;

    /* Step 2. Updating the physical page mapping table ppmt and the valid differential count table vdct */
    FOR EACH differential d in dwb DO
    BEGIN
        pid_d := physical page ID of the base page to which the differential d belongs;
        dp := ppmt(pid_d).differential_page;
        IF dp ≠ null THEN   /* if the differential page already exists */
            Call decreaseValidDifferentialCount(dp);   /* decrement the valid differential count for dp */
        END /* IF */
        ppmt(pid_d).differential_page := q;   /* set the differential page containing d to the new
                                                 differential page q */
        vdct(q).count := vdct(q).count + 1;   /* increment the valid differential count for q */
    END /* FOR */

Procedure decreaseValidDifferentialCount( ):
Input: dp   /* differential page */
Algorithm:
    vdct(dp).count := vdct(dp).count - 1;   /* decrement the valid differential count for dp */
    IF vdct(dp).count = 0 THEN
        Set dp to obsolete;
    END /* IF */

Procedure writingNewBasePage( ):
Inputs: (1) p    /* logical page */
        (2) pid  /* physical page ID of p */
Algorithm:
    /* Step 1. Writing p into flash memory as a new base page */
    Write p into the physical page q that is newly allocated in flash memory;

    /* Step 2. Updating the physical page mapping table ppmt and the valid differential count table vdct */
    bp := ppmt(pid).base_page;
    dp := ppmt(pid).differential_page;
    Set bp to obsolete;
    IF dp ≠ null THEN
        Call decreaseValidDifferentialCount(dp);   /* decrement the valid differential count for dp */
    END /* IF */
    ppmt(pid).base_page := q;             /* set the base page for the logical page p to the new base page q */
    ppmt(pid).differential_page := null;  /* set the differential page for p to null */



Figure 8. The procedures for the PDL_Writing algorithm in Figure 7.
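
As an illustration only, the bookkeeping of Figures 7 and 8 can be sketched in Python. This is a minimal in-memory model, not the authors' implementation: the class name Pdl, the dictionary-based flash, the 2-Kbyte PAGE_SIZE, and the choice MAX_DIFFERENTIAL_SIZE = PAGE_SIZE are all assumptions made for the sketch, and the differential is treated as an opaque byte string that the caller has already computed (Steps 1 and 2 of Figure 7).

```python
from dataclasses import dataclass, field

PAGE_SIZE = 2048              # assumed physical page size in bytes
MAX_DIFFERENTIAL_SIZE = 2048  # assumed equal to one physical page


@dataclass
class Pdl:
    """Toy in-memory model of the PDL tables (all names are illustrative)."""
    flash: dict = field(default_factory=dict)   # page number -> stored contents
    obsolete: set = field(default_factory=set)  # page numbers set to obsolete
    ppmt: dict = field(default_factory=dict)    # pid -> {"base": q, "diff": q or None}
    vdct: dict = field(default_factory=dict)    # differential page -> valid count
    dwb: dict = field(default_factory=dict)     # pid -> differential (bytes)
    next_page: int = 0

    def allocate(self):
        q = self.next_page
        self.next_page += 1
        return q

    def dwb_free_space(self):
        return PAGE_SIZE - sum(len(d) for d in self.dwb.values())

    def decrease_valid_differential_count(self, dp):
        self.vdct[dp] -= 1
        if self.vdct[dp] == 0:
            self.obsolete.add(dp)

    def writing_differential_write_buffer(self):
        q = self.allocate()
        self.flash[q] = dict(self.dwb)      # Step 1: write dwb as a differential page
        self.vdct[q] = 0
        for pid_d in self.dwb:              # Step 2: update ppmt and vdct
            dp = self.ppmt[pid_d]["diff"]
            if dp is not None:
                self.decrease_valid_differential_count(dp)
            self.ppmt[pid_d]["diff"] = q
            self.vdct[q] += 1
        self.dwb.clear()

    def writing_new_base_page(self, pid, page):
        q = self.allocate()
        self.flash[q] = page                # Step 1: write p as a new base page
        entry = self.ppmt.get(pid)
        if entry is not None:               # Step 2: retire the old pages
            self.obsolete.add(entry["base"])
            if entry["diff"] is not None:
                self.decrease_valid_differential_count(entry["diff"])
        self.ppmt[pid] = {"base": q, "diff": None}

    def write(self, pid, page, differential):
        """Step 3 of Figure 7, with the cases tested in the figure's order."""
        self.dwb.pop(pid, None)             # remove an old differential from dwb
        size = len(differential)
        if size <= self.dwb_free_space():   # Case 1
            self.dwb[pid] = differential
        elif size <= MAX_DIFFERENTIAL_SIZE:  # Case 2
            self.writing_differential_write_buffer()
            self.dwb[pid] = differential
        else:                               # Case 3
            self.writing_new_base_page(pid, page)
```

In this sketch a page is first loaded with writing_new_base_page, and each later update goes through write; a differential page becomes obsolete only when its valid differential count drops to 0.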

Figure 9 shows the algorithm PDL_Reading. The input to PDL_Reading is the physical page ID
pid of the logical page to read. The algorithm consists of the following three steps. In Step 1, we read
base_page(pid) from flash memory. In Step 2, we find differential(pid) of base_page(pid). Here, there
are two cases depending on the place where differential(pid) resides. First, when differential(pid)
resides in the differential write buffer, i.e., when the buffer has not yet been written out to flash memory,
we find it in the buffer. Second, when we cannot find it in the buffer, we read differential_page(pid)
from flash memory and find differential(pid) in it. In Step 3, we recreate the logical page p by merging
base_page(pid) read in Step 1 with differential(pid) found in Step 2.



Algorithm PDL_Reading
Input: pid   /* physical page ID */
Output: p    /* logical page */
Algorithm:
    /* Step 1. Reading the base page by looking up the physical page mapping table ppmt */
    bp := ppmt(pid).base_page;
    Read bp from flash memory;

    /* Step 2. Finding the differential */
    IF differential(pid) resides in the differential write buffer THEN
        Find differential(pid) from the buffer;
    ELSE
        dp := ppmt(pid).differential_page;
        IF dp ≠ null THEN
            Read dp from flash memory;
            Find differential(pid) from dp read from flash memory;
        ELSE
            Return bp as the result p;   /* there is no differential page */
        END /* IF */
    END /* IF */

    /* Step 3. Merging the base page with the differential */
    Merge bp with differential(pid) to make p;
    Return p;



Figure 9. Recreating a logical page from flash memory in PDL. 
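
The three steps of PDL_Reading can be sketched as follows. This is an illustrative model under stated assumptions: a differential is represented as a list of (offset, data) runs, flash is a plain dictionary, and the optional reads list (a hypothetical instrumentation aid, not part of the algorithm) records which flash pages were read so that the at-most-two-page reading behavior is visible.

```python
def merge(base, differential):
    """Apply (offset, data) runs to a copy of the base page."""
    page = bytearray(base)
    for offset, data in differential:
        page[offset:offset + len(data)] = data
    return bytes(page)


def pdl_read(pid, ppmt, flash, dwb, reads=None):
    """Recreate logical page pid; `reads` collects the flash pages read."""
    reads = [] if reads is None else reads
    entry = ppmt[pid]

    # Step 1: read the base page.
    reads.append(entry["base"])
    base = flash[entry["base"]]

    # Step 2: find the differential, first in the write buffer, then in flash.
    if pid in dwb:
        differential = dwb[pid]
    elif entry["diff"] is not None:
        reads.append(entry["diff"])
        differential = flash[entry["diff"]][pid]
    else:
        return base  # no differential exists: the base page is the result

    # Step 3: merge the base page with the differential.
    return merge(base, differential)
```

A page whose differential is still in the write buffer, or that has never been updated, costs one flash read in this model; otherwise it costs two.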



4.4 Discussions 

PDL has the following four advantages. (1) As compared with the page-based methods, it has good 
write performance, i.e., it requires fewer write operations, when we need to reflect an updated logical 
page into flash memory. This is due to the writing-difference-only principle. (2) As compared with 
the log-based methods, it has good write performance when a logical page is updated multiple times. 
This is due to the at-most-one-page writing principle. (3) As compared with the log-based methods, 
it has good read performance when recreating a logical page from flash memory. This is due to the
at-most-two-page reading principle. (4) Moreover, it allows existing disk-based DBMSs to be reused
without modification as flash-based DBMSs because it is DBMS-independent. 

Figure 10 shows the DBMS architecture that uses flash memory as a secondary storage. The log-based
methods need to modify the storage management module of the DBMS so as to write the update
log whenever the page is updated, as shown in Figure 10(a). PDL, on the other hand, requires no
modification of the DBMS; only the flash memory driver^ needs to be modified, because PDL computes the
differential by comparing the whole updated logical page with its base page. Thus, it can be implemented
inside the flash memory driver, as shown in Figure 10(b), without affecting the storage manager of the
existing DBMS.
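
To make the driver-level comparison concrete, the following sketch computes a differential from nothing but the base page and the updated page, which is why no cooperation from the DBMS is needed. The representation of a differential as (offset, data) runs and the function name make_differential are assumptions for illustration; the paper's actual differential format may differ.

```python
def make_differential(base, updated):
    """Return the byte runs of `updated` that differ from `base` as (offset, data) pairs."""
    if len(base) != len(updated):
        raise ValueError("pages must have the same size")
    runs = []
    start = None  # offset where the current run of changed bytes began
    for i, (old, new) in enumerate(zip(base, updated)):
        if old != new and start is None:
            start = i
        elif old == new and start is not None:
            runs.append((start, updated[start:i]))
            start = None
    if start is not None:
        runs.append((start, updated[start:]))
    return runs
```

Because the comparison is always against the base page, a page that was changed several times in memory still yields a single differential covering only its net changes.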



[Figure 10: two layered block diagrams. (a) The log-based methods: an existing disk-based DBMS containing
the log-based method, above the flash memory driver, above flash memory. (b) Page-differential logging: an
existing disk-based DBMS, above a flash memory driver containing the page-differential logging method,
above flash memory.]

Figure 10. The DBMS architecture that uses flash memory as a secondary storage.

PDL, however, has the following minor drawbacks. First, when recreating a logical page from
flash memory, PDL has to read one more page than page-based methods do. However, this drawback
is relatively minor because a read operation is much faster than a write or erase
operation. Furthermore, if a database is used for read-only access, PDL reads only one physical page
just like page-based methods since a differential page does not exist (i.e., the base page has not been
updated). Thus, in this case, the read performance of PDL is as good as that of the page-based methods.
Second, the data size written into flash memory in PDL could be larger than that in log-based methods.
This is because the differential contains all the differences between an updated logical page and its base
page, while the update log in the log-based methods contains only the difference between an updated
logical page and its immediate previous version. However, in spite of these drawbacks, PDL improves the
overall performance significantly because the advantages outweigh them. We will show the
performance advantages later in the experiment section (Section 5). Table 2 summarizes the differences
between PDL and the log-based and page-based methods.

^ This flash memory driver corresponds to the FTL shown in Figure 2.



Table 2. Comparison of PDL with log-based and page-based methods.

                           PDL                        log-based methods       page-based methods
data to be written         differential               an update log           the whole page
into flash memory                                     (changed parts only)    (changed and
                                                                              unchanged parts)

time for writing data      only when a logical page   whenever a page is      no write buffer
into the write buffer      needs to be reflected      updated
                           into flash memory

time for writing data      when the write buffer      when the write buffer   when a page needs
into flash memory          is full                    is full                 to be reflected
                                                                              into flash memory

number of physical         maximum two pages          multiple pages          one page
pages to read when         (1 ≤ n ≤ 2)
recreating a logical page

architecture               loosely-coupled            tightly-coupled         loosely-coupled
                           (DBMS-independent)         (DBMS-dependent)        (DBMS-independent)


4.5 Crash Recovery 

A storage device with a cache normally supports a write-through command that flushes the data written 
into the cache immediately out to the device. When the write-through command is called, PDL flushes 
the differential write buffer out into flash memory. In flash memory, the page writing is guaranteed to 
be atomic at the chip level [9] . 

When a system failure occurs, we lose the physical page mapping table and the valid differential
count table in memory. However, by one scan through the physical pages in flash memory, we can reconstruct
those tables. Here, the tables are recovered to the state in which data were reflected into flash memory
by the write-through call or by flushing the differential write buffer. That is, the data retained only in the
write buffer but not written out to flash memory are not recovered in the tables. This is analogous
to the situation where data retained only in the file buffer but not written out to disk in a disk file system
are not recovered after a system failure. Thus, when persistency of data is required, a write-through
call must be used.

If a system failure occurs when a base page (or the differential write buffer) has been written into flash
memory, but the old base page (or the differential page that no longer contains any valid differential)
has not yet been set to obsolete in Figure 7, the new base page (or differential page) and the old base
page (or differential page) might co-exist in flash memory. Thus, to identify the most up-to-date base
page (or differential page), we use the creation time stamp stored in a base page and in each differential
in a differential page, as in Chang et al. [5].

Figure 11 shows the algorithm for reconstructing the physical page mapping table ppmt and the
valid differential count table vdct. For every physical page r in flash memory, we read the spare area of
r and update ppmt and vdct only if r is not obsolete. Here, there are two cases according to the type
of r. First, when r is a base page (Case 1), we check whether ts(r) is more recent than ts(bp), where
ts(r) is the creation time stamp of r and ts(bp) is that of the base page bp currently in ppmt. If so,
r must be a more recent base page. Thus, we set base_page(pid) to r and set the old base page bp to
obsolete, where pid is the physical page ID of r. We then check whether ts(r) is more recent than
ts(dp, differential(pid)), which is the time stamp of differential(pid) in the differential page dp currently
in ppmt. If so, differential(pid) must be obsolete since we have a base page r that is more recent. Thus,
we set differential_page(pid) to null and decrement the count for the old differential page dp by executing
the procedure decreaseValidDifferentialCount( ). If ts(r) is not more recent than ts(bp), we set r to obsolete.
Second, when r is a differential page (Case 2), we read the data area of r. For each differential d in r,
we check whether ts(d) is more recent than both ts(bp) and ts(dp, differential(pid_d)), where ts(d) is the
time stamp of d, ts(bp) is that of the base page bp currently in ppmt, and ts(dp, differential(pid_d)) is
that of differential(pid_d) in the differential page dp currently in ppmt. Here, pid_d is the physical page
ID of the base page to which the differential d belongs. If so, d must be a more recent differential of
bp than differential(pid_d) currently in ppmt. Thus, we set differential_page(pid_d) to r, decrement the
count for the old differential page dp by executing the procedure decreaseValidDifferentialCount( ), and
increment the count for the new differential page r. If r does not contain any valid differential after
processing all the differentials in r, we set r to obsolete.



Algorithm PDL_RecoveringfromCrash
/* Reconstructing the physical page mapping table ppmt and the valid differential count table vdct */
Initialize ppmt and vdct;
FOR EACH physical page r in flash memory DO
BEGIN
    Read the spare area of r from flash memory;
    IF IS_OBSOLETE_PAGE(r) THEN
        CONTINUE;
    END /* IF */
    IF IS_BASE_PAGE(r) THEN   /* Case 1: r is a base page */
        pid := physical page ID of r;
        bp := ppmt(pid).base_page;
        dp := ppmt(pid).differential_page;
        /* ts(x, y) returns the creation time stamp as follows:
           (1) if x is a base page or a differential, returns the time stamp of x (here, y can be omitted)
           (2) if x is a differential page, returns the time stamp of differential y in x */
        IF ts(r) > ts(bp) THEN   /* r is a more recent base page */
            Set bp to obsolete;
            ppmt(pid).base_page := r;   /* set the base page with pid to the new base page r */
            IF ts(r) > ts(dp, differential(pid)) THEN   /* r is more recent than differential(pid) in dp */
                Call decreaseValidDifferentialCount(dp);   /* decrement the valid differential count for dp */
                ppmt(pid).differential_page := null;   /* set the differential page containing differential(pid) to null */
            END /* IF */
        ELSE   /* bp is a more recent base page */
            Set r to obsolete;
        END /* IF */
    ELSE   /* Case 2: r is a differential page */
        Read the data area of r from flash memory;
        FOR EACH differential d in r DO
        BEGIN
            pid_d := physical page ID of the base page to which the differential d belongs;
            bp := ppmt(pid_d).base_page;
            dp := ppmt(pid_d).differential_page;
            IF ts(d) > ts(bp) AND ts(d) > ts(dp, differential(pid_d)) THEN
                    /* d is more recent than bp and differential(pid_d) in dp */
                Call decreaseValidDifferentialCount(dp);   /* decrement the valid differential count for dp */
                ppmt(pid_d).differential_page := r;   /* set the differential page containing d to the new differential page r */
                vdct(r).count := vdct(r).count + 1;   /* increment the valid differential count for r */
            END /* IF */
        END /* FOR */
        IF vdct(r).count = 0 THEN   /* r does not contain any valid differential */
            Set r to obsolete;
        END /* IF */
    END /* IF */
END /* FOR EACH */



Figure 11. The algorithm for reconstructing the physical page mapping table and the valid differential
count table upon system failure.

In Figure 11, we set two kinds of useless pages to obsolete: (1) base pages that are not recent but
have not been set to obsolete and (2) differential pages that do not contain any valid differential but have
not been set to obsolete. These pages can occur in flash memory when a system failure occurs if a base
page (or the differential write buffer) has been written into flash memory, but the old base page (or the
differential page that does not contain valid differentials) has not yet been set to obsolete.
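
The single-scan reconstruction of Figure 11 can be sketched in Python as below. The layout of the pages argument (a dictionary of per-page metadata standing in for the spare and data areas), the field names, and the use of -1 as the time stamp of a missing entry are assumptions made for this sketch; the guards for a missing base or differential page are additions that the figure leaves implicit.

```python
def recover(pages):
    """Rebuild ppmt and vdct from one scan over `pages` (page number -> metadata)."""
    ppmt = {}        # pid -> {"base", "base_ts", "diff", "diff_ts"}
    vdct = {}        # differential page -> valid differential count
    obsolete = set()  # pages found useless during the scan

    def new_entry():
        return {"base": None, "base_ts": -1, "diff": None, "diff_ts": -1}

    def decrease(dp):
        vdct[dp] -= 1
        if vdct[dp] == 0:
            obsolete.add(dp)

    for r, meta in pages.items():
        if meta["obsolete"]:
            continue
        if meta["kind"] == "base":                      # Case 1: r is a base page
            e = ppmt.setdefault(meta["pid"], new_entry())
            if meta["ts"] > e["base_ts"]:               # r is a more recent base page
                if e["base"] is not None:
                    obsolete.add(e["base"])
                e["base"], e["base_ts"] = r, meta["ts"]
                if e["diff"] is not None and meta["ts"] > e["diff_ts"]:
                    decrease(e["diff"])                 # the recorded differential is stale
                    e["diff"], e["diff_ts"] = None, -1
            else:
                obsolete.add(r)
        else:                                           # Case 2: r is a differential page
            vdct[r] = 0
            for pid_d, ts in meta["differentials"]:
                e = ppmt.setdefault(pid_d, new_entry())
                if ts > e["base_ts"] and ts > e["diff_ts"]:
                    if e["diff"] is not None:
                        decrease(e["diff"])
                    e["diff"], e["diff_ts"] = r, ts
                    vdct[r] += 1
            if vdct[r] == 0:
                obsolete.add(r)
    return ppmt, vdct, obsolete
```

In this model, pages that lost the time stamp comparison end up in the obsolete set, which corresponds to the two kinds of useless pages described above.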

The algorithm PDL_RecoveringfromCrash guarantees that recovery is normally performed even 
when a system failure repeatedly occurs during the process of restarting the system. The reason is that 
the algorithm does not change data in the flash memory except setting the useless pages (i.e., the pages 
that are no longer used, but have not been set to obsolete) to obsolete. Setting useless pages to obsolete 
does not affect the recovery process of reconstructing the physical page mapping table and the valid 
differential count table. 

Since scanning the entire flash memory of 1 Gbytes takes approximately 60 seconds (derived from
Table 1 in Section 2), the scan time is acceptable in practice. To recover the physical page
mapping table without scanning all the physical pages in flash memory, we would have to log the changes in
the mapping table into flash memory. We leave this extension as a further study.
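
The order of magnitude of the scan time can be checked with a short calculation. It assumes 2-Kbyte physical pages (the page size used in the experiments) and charges one read of T_read = 110 µs (the value quoted with Figure 12) per physical page; the exact derivation from Table 1 may differ.

```python
GBYTE = 1024 ** 3
PAGE_SIZE = 2 * 1024   # assumed 2-Kbyte physical pages
T_READ_US = 110        # assumed cost of one page read, in microseconds

pages = GBYTE // PAGE_SIZE
scan_seconds = pages * T_READ_US / 1_000_000
print(pages, round(scan_seconds, 1))  # prints: 524288 57.7
```

Under these assumptions the scan takes a little under one minute, in line with the approximately 60 seconds stated above.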

We note that we can implement the proposed PDL and recovery techniques in a DBMS that uses
flash memory to support transactional database recovery, just as we do in a DBMS built on top of an O/S
file system, by using the write-through facility whenever persistency of a write operation is required (e.g.,
when writing the 'transaction commit' log record).

5 Performance Evaluation 

5.1 Experimental Data and Environment 

We compare the data access performance of PDL proposed in this paper with those of the page-based
and log-based methods discussed in Section 3. We use the wall clock time taken to access data from
flash memory (we call it the I/O time) as the measure. Here, as the page-based method, we use the
one employing the out-place update (OPU) scheme with the page-level mapping technique, which is
known to have good performance even though the method consumes memory excessively [9]. We also
compare with the in-place update method (IPU). As the log-based method, we use the in-page logging
method (IPL) proposed by Lee and Moon [14].

We use synthetic relational data of 1 Gbytes and update operations for comparing the data access
performance of the methods. We define an update operation as consisting of the following three
steps: (1) reading the addressed page; (2) changing the data in the page; and (3) writing the updated
page. The reading step (1) creates a logical page by reading physical pages from flash memory, and
the writing step (3) writes the updated logical page as one or more physical pages into flash memory.
The experiments are designed this way to exclude the buffering effect in the DBMS. Therefore, we can
measure read and write performance as well as overall performance by executing only update operations.

The I/O time is affected by N_updates_till_write and %ChangedByOneU_Op. Here, N_updates_till_write
is the number of update operations applied to a logical page in memory from the time it is recreated
from flash memory until the time it is reflected back into flash memory, and %ChangedByOneU_Op is the
percentage of data changed in a logical page by a single update operation. Here, the portion of data to
be changed is randomly selected. We also compare the performance of various mixes of read-only and
update operations, varying the percentage of update operations (%UpdateOps). Besides, we measure
the performance as we vary the performance parameters of flash memory (i.e., the I/O times for read
and write operations in Table 1). We also compare the longevity of flash memory. Finally, we perform
the TPC-C benchmark [20] as a real workload. Table 3 summarizes the experiments and parameters.

In each experiment, garbage collection is invoked whenever there is no more free page in flash
memory^. Here, the cost (time) of garbage collection is amortized into that of the write operation
because garbage collection is incurred by the accumulated effect of write operations. We repeatedly
execute experiments so that garbage collection is invoked for each block at least ten times on the
average after loading the database, in order to bring the database to a steady state.

^ In IPL, garbage collection is invoked during the process of merging.

Table 3. Experiments and parameters.



Experiment  Description                                    Parameter               Value

Exp. 1      Read, write, and overall time per              %ChangedByOneU_Op       2
            update operation                               N_updates_till_write    1

Exp. 2      Overall time per update operation              %ChangedByOneU_Op       2
            as N_updates_till_write is varied              N_updates_till_write    1 ~ 8

Exp. 3      Overall time per update operation              %ChangedByOneU_Op       0.1 ~ 100
            as %ChangedByOneU_Op is varied                 N_updates_till_write    1, 5

Exp. 4      Overall time per operation for the mixes       %ChangedByOneU_Op       2
            of read-only and update operations             N_updates_till_write    1, 5
            as %UpdateOps is varied                        %UpdateOps              0 ~ 100

Exp. 5      Overall time per update operation as           %ChangedByOneU_Op       2
            the parameters of flash memory are varied      N_updates_till_write    1
                                                           T_read (µs)             10 ~ 1500
                                                           T_write (µs)            500, 1000

Exp. 6      Number of erase operations per update          %ChangedByOneU_Op       2
            operation as N_updates_till_write is varied    N_updates_till_write    1 ~ 8

Exp. 7      I/O time per transaction for TPC-C data        DBMS buffer size        1 ~ 100 Mbytes
            as the DBMS buffer size is varied                                      (0.1 ~ 10% of
                                                                                   database size)


For the experiments, we have implemented an emulator of a 2-Gbyte flash memory chip using the
parameters shown in Table 1^. We also have implemented the four methods: PDL(x), OPU, IPU,
and IPL(y)^^. Here, x is Max_Differential_Size (defined in Section 4.3),
and y is the amount of log pages in each block. We used the Odysseus ORDBMS [23][24] as the storage
system. Here, PDL, OPU, and IPU are implemented outside the DBMS, and IPL inside the DBMS.
We conducted all experiments on a Pentium 4 3.0 GHz Linux PC with 2 Gbytes of main memory. We
set the size of a logical page to be 2 Kbytes, which is the size of a physical page in flash memory. We
also test the case with a logical page of 8 Kbytes, as was done by Lee and Moon [14].

^ For each operation, the emulator returns the required time in the flash memory, which is specified in Table 1, while
writing and reading the data to and from the disk. The data are in exactly the same format on disk as they would be stored
in flash memory. Thus, access time using the emulator must be identical to that using the real flash memory.

^^ We set the size of the log buffer for each logical page to a fraction of the size of a logical page, as was used by Lee and
Moon [14].

We do not use wear-leveling in this paper, but the same wear-leveling techniques can be applied to these methods.
We use the same garbage collection method suggested by Woodhouse [21] for PDL and OPU.

5.2 Results of the Experiments
Experiment 1: 

Figure 12 shows the read, write, and overall time per update operation for the six methods: IPL (18KB),
IPL (64KB), PDL (2KB), PDL (256B), OPU, and IPU. For IPL(y), we have varied y from 8 Kbytes
to 64 Kbytes. Among them, we select IPL (18KB) and IPL (64KB) because they have the best and
worst overall time for update operations, respectively. For PDL, we select PDL (2KB) and PDL (256B)
because the amounts of differential pages in them are similar to those of log pages in IPL (64KB) and
IPL (18KB), respectively. Specifically, IPL (64KB) and PDL (2KB) use 50 % of flash memory for storing
log/differential pages. IPL (18KB) and PDL (256B) use 14.1 % and 11.1 % of flash memory, respectively.

Figure 12(a) shows that the I/O time of the reading step per update operation is in the following
order: IPL (64KB), IPL (18KB), PDL (2KB) / PDL (256B), and OPU/IPU. This result is consistent
with what was discussed in Sections 3 and 4. OPU and IPU require one read operation. PDL requires
at most twice as many read operations. IPL requires multiple read operations. We note that, when we
perform read-only operations, we obtain the same result as is shown in Figure 12(a).

Figure 12(b) shows that the I/O time of the writing step is in the following order: IPU, OPU,
PDL (2KB), IPL (18KB), IPL (64KB), and PDL (256B). Here, the slashed area indicates the I/O time for
garbage collection. The result is also consistent with the discussions in Sections 3 and 4. For an update
operation, OPU requires two write operations: one for writing the updated page into flash memory
and another for setting the original page to obsolete. IPL, however, requires only one write operation,
for writing the log buffer into flash memory. PDL (2KB) requires approximately two write operations
for every two update operations: one for writing the differential write buffer into flash memory and
another for setting one (on the average) differential page to obsolete^, because the size of a differential
is approximately half a page on the average^^. Thus, PDL (2KB) requires approximately one write
operation per update operation on the average. PDL (256B) requires fewer write operations
than PDL (2KB) does since the differential write buffer is filled less frequently. However, PDL additionally
requires one read operation for reading the base page in from flash memory in order to create the
differential. Here, each method includes a certain amount of read cost, which is incurred by garbage
collection and amortized into the write cost. We note that PDL (256B) outperforms the other methods
due to less frequent writing of the differential write buffer.

^ When the count of valid differentials in vdct becomes 0, we set the differential page to obsolete.

^^ Since the size of a differential changes from 0 to 1 page size and back to 0 (Case 3 in Figure 7) as updating a logical
page is repeated, the size of a differential in a steady state is approximately half a page on the average.

[Figure 12: bar charts for PDL(2KB), PDL(256B), OPU, IPU, IPL(18KB), and IPL(64KB), showing read time and
write time (including erase time).]
(a) The I/O time of the reading step.
(b) The I/O time of the writing step. Slashed parts indicate the time for garbage collection. Lighter areas
represent read time.
(c) The overall time per update operation including read and write times in (a) and (b).

Figure 12. The read, write, and overall time per update operation (N_updates_till_write = 1,
%ChangedByOneU_Op = 2, database size = 1 Gbytes, T_read = 110 µs, T_write = 1010 µs).

Figure 12(c) shows the overall time per update operation combining the I/O times shown in
Figures 12(a) and (b). PDL (256B) has good read and write performance as shown in Figures 12(a)
and (b), and thus has the best overall time for an update operation. (This corresponds to Figure 13(a)
when N_updates_till_write = 1.)

Experiment 2: 

Figure 13 shows the overall time per update operation of the six methods as N_updates_till_write
is varied. First, the I/O time of OPU and IPU is steady regardless of the parameter because they
always write the whole page when reflecting an updated logical page into flash memory. Next, the I/O
time of IPL increases in a stepwise manner. The reason for this behavior is that the number of write
operations for a logical page is computed as
$\lceil \frac{\text{the size of the update logs}}{\text{the size of the log buffer}} \rceil$. Here, the size of
the update logs to be written increases linearly as N_updates_till_write increases because IPL keeps all
the update logs of a logical page. (We note that this process of writing is not bound by merging while
the reading process is.) Finally, the I/O time of PDL (2KB) increases only very slightly as
N_updates_till_write increases because the size of the overlap among the changed parts becomes larger
as N_updates_till_write increases, with the total size of the difference being limited to one page. The
I/O time of PDL (256B) increases approximately linearly as N_updates_till_write increases because the
size of the overlap is small. As N_updates_till_write increases, the I/O time of PDL (256B) approaches
that of OPU because the logical page itself (rather than the differential) is written into flash memory as
the size of the differential becomes larger than Max_Differential_Size (Case 3 in Figure 7). As a result,
PDL (256B) outperforms OPU, IPU, and IPL. The result when the size of a logical page is 8 Kbytes
shows a tendency similar to that when the size of a logical page is 2 Kbytes.

(a) size of a logical page = 2 Kbytes.
(b) size of a logical page = 8 Kbytes.

Figure 13. The overall time per update operation as N_updates_till_write is varied
(%ChangedByOneU_Op = 2).
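
The stepwise growth of the IPL write count discussed in Experiment 2 can be illustrated with a small calculation. The function name and the numbers used in the usage note (41 bytes of log per update, roughly 2 % of a 2-Kbyte page, and a 128-byte log buffer) are hypothetical values chosen for illustration, not parameters taken from the experiments.

```python
def ipl_log_writes(n_updates, log_bytes_per_update, log_buffer_bytes):
    """Ceiling of (total size of the update logs) / (size of the log buffer)."""
    total = n_updates * log_bytes_per_update
    return (total + log_buffer_bytes - 1) // log_buffer_bytes


# Hypothetical values: 41 bytes of log per update, 128-byte log buffer.
writes = [ipl_log_writes(n, 41, 128) for n in range(1, 9)]
print(writes)  # prints: [1, 1, 1, 2, 2, 2, 3, 3]
```

The log size grows linearly with the number of updates, but the number of writes changes only when the accumulated logs cross a multiple of the log buffer size, which produces the steps.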

Experiment 3: 

Figure 14 shows the overall time per update operation for the six methods as %ChangedByOneU_Op
is varied. The result is consistent with what we observed in Figure 13. We note that PDL (256B)
outperforms OPU, IPU, and IPL for the same reason as in Figure 13. When %ChangedByOneU_Op ≈
100, the I/O time of PDL (2KB) is slightly larger than that of OPU because, while the two methods
require the same number of write operations, PDL (2KB) needs three times as many read operations:
two for reading the base page and the differential page when recreating a logical page from flash memory,
and one for reading the base page again to create the differential when reflecting the updated logical
page into flash memory.



Figure 14. The overall time per update operation as %ChangedByOneU_Op is varied
(N_updates_till_write = 1, 5).

Experiment 4:

Figure 15 shows the results of Experiment 4. When updates are rare (i.e., %UpdateOps ≈ 0), OPU
outperforms PDL and IPL (see Figure 12(a)). As %UpdateOps increases, PDL becomes superior to
OPU because of its superiority in update performance (see Figure 12(c)). We also note that PDL
always outperforms IPL. In summary, for various mixes of read-only and update operations, PDL (256B)
improves performance by 0.5 ~ 3.4 times over OPU, by 1.6 ~ 3.1 times over IPL (18KB), and by 2.0
~ 9.7 times over IPL (64KB). We note that the case of 0.5 times over OPU is the special case where all
transactions are read-only (i.e., %UpdateOps = 0).

Figure 15. The overall time per operation for the mixes of read-only and update operations as
%UpdateOps is varied (%ChangedByOneU_Op = 2).

Experiment 5:

Figure 16 shows the overall time per update operation as the T_read and T_write parameters of flash
memory are varied. We observe that PDL (256B) always outperforms OPU and IPL. As the read
time (T_read) increases, OPU becomes superior to PDL (2KB) or IPL. We have this result because OPU
has superiority in read performance (see Figure 12(a)). We note that PDL (256B) outperforms OPU
and IPL regardless of the T_read and T_write parameters of flash memory.

[Figure 16: line plots of the overall time per update operation versus T_read (10 to 1500 µs) for PDL(2KB),
PDL(256B), OPU, IPL(18KB), and IPL(64KB).]
(a) T_write = 500 µs. (b) T_write = 1000 µs.

Figure 16. The overall time per update operation as the T_read and T_write parameters of flash memory
are varied (N_updates_till_write = 1, %ChangedByOneU_Op = 2, T_erase = 1500 µs).

Experiment 6: 

Figure [T71 shows the number of erase operations per update operation as N japdatesJtill -write is varied. 
We observe that, when N -updates -till -write — 1, the number of erase operations per update operation is 
in the following order: OPU, PDL (2KB), IPL (18KB), PDL (256B), and IPL (64KB). Thus, IPL (64KB) 
has the best longevity among the five methods. But, it has poor performance for the mixes of read-only 
and update operations as shown in Figure [15] PDL (256B) has good longevity next to IPL (64KB). 
Besides, it has significantly good performance for the mixes of read-only and update operations. 
Figure 17. The number of erase operations per update operation as N_updates_till_write is varied 
(%ChangedByOneU_Op = 2). 

Experiment 7: 

Figure 18 shows the results of the TPC-C benchmark. We observe that the I/O time is in the following 
order (from longest to shortest): IPL(64KB), IPL(18KB), OPU, PDL(2KB), and PDL(256B). The result 
shows that PDL outperforms other methods in real workloads as well. 

[Figure 18 plot. Legend: PDL(2KB), PDL(256B), OPU, IPL(18KB), IPL(64KB). x-axis: Buffer size (Mbytes), ticks at 1, 5, 10, 50, 100.] 

Figure 18. TPC-C benchmark: I/O time per transaction as the DBMS buffer size is varied (database 
size = 1 Gbytes). 

6 Conclusions 

We have proposed a novel approach for storing data, called page-differential logging, for flash-based storage 
systems. We have defined the notion of the differential and presented the algorithms for reading and 
writing pages into flash memory using the differential. 

We have identified three design principles: writing-difference-only, at-most-one-page writing, and 
at-most-two-page reading. These principles guarantee good performance for both read and write oper- 
ations. We have shown that our method conforms to these principles. 
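
The writing-difference-only and at-most-two-page reading principles can be illustrated with a minimal sketch. This is not the paper's algorithm or on-flash layout; it assumes a differential is represented as a list of (offset, changed bytes) runs, and the function names `compute_differential` and `apply_differential` are hypothetical. Recreating a page then needs only the base page plus the one page that holds its differential.

```python
from typing import List, Tuple

# Assumed representation: each entry is (offset in page, bytes that changed).
Differential = List[Tuple[int, bytes]]


def compute_differential(base: bytes, current: bytes) -> Differential:
    """Collect the runs of bytes where the up-to-date page differs from the base page."""
    if len(base) != len(current):
        raise ValueError("pages must have the same size")
    diff: Differential = []
    i, n = 0, len(base)
    while i < n:
        if base[i] != current[i]:
            start = i
            # Extend the run while consecutive bytes keep differing.
            while i < n and base[i] != current[i]:
                i += 1
            diff.append((start, current[start:i]))
        else:
            i += 1
    return diff


def apply_differential(base: bytes, diff: Differential) -> bytes:
    """Recreate the up-to-date page by overlaying the differential on the base page."""
    page = bytearray(base)
    for offset, data in diff:
        page[offset:offset + len(data)] = data
    return bytes(page)


# Usage: a 2 KB page of zeros with two small updates.
base_page = bytes(2048)
updated = bytearray(base_page)
updated[10:14] = b"abcd"
updated[100] = 0x7F
differential = compute_differential(base_page, bytes(updated))
print(differential)  # [(10, b'abcd'), (100, b'\x7f')]
```

In this sketch only five bytes of payload (plus offsets) would need to be written instead of the whole 2 KB page, which is the intuition behind writing only the difference.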

Page-differential logging is DBMS-independent, i.e., it allows existing disk-based DBMSs to be 
reused as flash-based DBMSs just by modifying the flash memory driver. In addition, it improves the 
longevity of flash memory by reducing the number of erase operations compared with existing page-based 
methods. 

We have performed extensive experiments to compare the performance of page-differential logging 
with existing page-update methods. Through these experiments, we have shown that the performance of 
our method is superior to those of page-based and log-based methods, except when all transactions are 
read-only on already updated pages. We also performed experiments in which the read and write times 
of flash memory are varied. The results show that our method (in particular, PDL(256B)) is always 
superior to other methods. Thus, the results indicate that page-differential logging can be the preferred 
technique for commercial products*. We also performed experiments to compare various methods for 
the longevity of flash memory. The results show that our method (in particular, PDL(256B)) improves 
the longevity of flash memory compared with OPU and IPL (18KB). Finally, we performed the TPC-C 
benchmark as the DBMS buffer size is varied. The results show that our method (in particular, 
PDL(256B)) outperforms other methods by 1.2 ~ 6.1 times. This shows the effectiveness of our method 
under real workloads. 

Currently, we are implementing page-differential logging on a flash memory embedded board. Such 
an augmented board is to be incorporated into our Odysseus DBMS [24]. The resulting system will 
facilitate various flash-memory-dependent optimizations in various components of the DBMS such as 
the indexes, buffer, sort module, and query optimizer. We also note that, due to its DBMS-independent 
nature, page-differential logging can be employed by the manufacturer in the FTL of commercial SSDs. 
We leave these issues as future work. 

* Commercial SSDs offer average write time comparable to read time by exploiting parallelism, but individual NAND 
flash chips typically have asymmetric read/write times. 